Model selection is central to statistics, and many learning problems can be formulated as model selection problems. In this paper, we treat the problem of selecting a maximum entropy model, given various feature subsets and their moments, as a model selection problem, and present a minimum description length (MDL) formulation to solve it. To this end, we derive the normalized maximum likelihood (NML) codelength for these models. Furthermore, we prove that the minimax entropy principle is a special case of maximum entropy model selection in which the complexities of all models are assumed to be equal. We apply our approach to the gene selection problem and present simulation results.